Using Wikipedia for Named-Entity Translation
نویسندگان
چکیده
In this paper we present a system for translating named-entities from Basque to English using Wikipedia’s knowledge. We can exploit interlingual links from Wikipedia (WIL) to get named-entity translation, but entities without interlingual links can be translated using the Wikipedia as a corpus, suggesting new interlingual links. In this second case the interlingual links can be used as a test corpus in order to evaluate the translation process. We just need Wikipedia articles in both languages (specially in the target language) and a bilingual dictionary to apply this methodology to other language pairs.
منابع مشابه
بهبود شناسایی موجودیتهای نامدار فارسی با استفاده از کسره اضافه
Named entity recognition is a process in which the people’s names, name of places (cities, countries, seas, etc.) and organizations (public and private companies, international institutions, etc.), date, currency and percentages in a text are identified. Named entity recognition plays an important role in many NLP tasks such as semantic role labeling, question answering, summarization, machine ...
متن کاملNamed entities from Wikipedia for machine translation
In this paper we present our attempt to improve machine translation of named entities by using Wikipedia. We recognize named entities based on categories of English Wikipedia articles, extract their potential translations from corresponding Czech articles and incorporate them into a statistical machine translation system as translation options. Our results show a decrease of translation quality...
متن کاملNamed Entity Recognition in Persian Text using Deep Learning
Named entities recognition is a fundamental task in the field of natural language processing. It is also known as a subset of information extraction. The process of recognizing named entities aims at finding proper nouns in the text and classifying them into predetermined classes such as names of people, organizations, and places. In this paper, we propose a named entity recognizer which benefi...
متن کاملABRIR at NTCIR-9 GeoTime Task Usage of Wikipedia and GeoNames for Handling Named Entity Information
ABRIR at NTCIR-8 Approach C t ti f N d E tit d t b Construction of Boolean query by using pseudo relevant documents Named entity is more important than others Verb synonym list is used for increasing coverage Combination of Boolean IR model and probabilistic IR model (Okapi) Penalty is applied for documents that don’t satisfy Boolean query Named entity is more important than others F...
متن کاملPOLYGLOT-NER: Massive Multilingual Named Entity Recognition
The increasing diversity of languages used on the web introduces a new level of complexity to Information Retrieval (IR) systems. We can no longer assume that textual content is written in one language or even the same language family. In this paper, we demonstrate how to build massive multilingual annotators with minimal human expertise and intervention. We describe a system that builds Named ...
متن کامل